
Basic Plotly Charts

Estimated time needed: 45 minutes

Objectives

After completing this lab, you will be able to:

Use the Plotly graph objects and Plotly express libraries to plot different types of charts

Plotly Libraries

plotly.graph_objects: This is a low-level interface to figures, traces, and layout. The Plotly graph objects module provides an automatically generated hierarchy of classes (figures, traces, and layout) called graph objects. These graph objects represent figures with a top-level class plotly.graph_objects.Figure.

plotly.express: Plotly express is a high-level wrapper for Plotly. It is a recommended starting point for creating the most common figures provided by Plotly using a simpler syntax. It uses graph objects internally.

Now let us use these libraries to plot some charts. We will start with plotly.graph_objects to plot line and scatter plots.

Note: You can hover the mouse over the charts whenever you want to view any statistics in the visualization charts.

Exercise I: Get Started with Different Chart types in Plotly

1. Scatter Plot:

A scatter plot shows the relationship between two variables plotted on the x-axis and y-axis. The data points appear scattered when plotted on a two-dimensional plane. Using scatter plots, we can create visualizations that express various relationships between two variables.

However, in the previous output, the title and the x-axis and y-axis labels are missing. Let us use the update_layout function to update the title and labels.

Inferences:

From the above plot we find that the income of a person is not correlated with age. As age increases, income may or may not decrease.

2. Line Plot:

A line plot shows information that changes continuously with time. Here the data points are connected by straight lines. Line plots are also plotted on a two-dimensional plane, like scatter plots. Using line plots, we can create visualizations that illustrate how a quantity changes over time.

Inferences:

From the above plot we find that sales are highest in the month of May, after which they decline.

We will now use the Plotly express library to plot the other graphs.

3. Bar Plot:

A bar plot represents categorical data in rectangular bars. Each category is defined on one axis, and the value counts for this category are represented on another axis. Bar charts are generally used to compare values across categories.

In plotly express we set the axis values and the title within the same function call: px.<graphtype>(x=<x-axis value source>, y=<y-axis value source>, title=<appropriate title as a string>). In the code below we use px.bar(x=grade_array, y=score_array, title='Pass Percentage of Classes').

Inferences:

From the above plot we find that Grade 8 has the lowest pass percentage and Grade 10 has the highest pass percentage.

4. Histogram:

A histogram is used to represent continuous data in the form of bars. In a bar graph each bar represents a discrete value, whereas in a histogram each bar represents a range of values. Histograms show frequency distributions.

Inferences

From this we can see that around 2 people have a height of 130 cm and 45 people have a height of 160 cm.

5. Bubble Plot:

A bubble plot is used to show the relationship between 3 or more variables. It is an extension of a scatter plot in which an additional variable is encoded by the size of each marker.

Inferences

The size of the bubble in the bubble chart indicates that Chicago has the highest crime rate when compared with the other 2 cities.

6. Pie Plot:

A pie plot is a circular chart mainly used to represent each part of the data as a proportion of the whole. Each slice represents a proportion, and the proportions together add up to the whole.

Inferences

From this pie chart we can find that the family expenditure is maximum for rent.

7. Sunburst Charts:

Sunburst charts represent hierarchical data in the form of concentric circles. Here the innermost circle is the root node which defines the parent, and the outer rings move down the hierarchy from the centre. They are also called radial charts.

Inferences

Here the innermost circle, Eve, represents the parent, and the second circle represents her children Cain, Seth, and so on. The outermost circle represents her grandchildren, such as Enoch and Enos.

Exercise II: Apply your Plotly Skills to an Airline Dataset (Practice Exercises)

The Reporting Carrier On-Time Performance Dataset contains information on approximately 200 million domestic US flights reported to the United States Bureau of Transportation Statistics. The dataset contains basic information about each flight (such as date, time, departure airport, arrival airport) and, if applicable, the amount of time the flight was delayed and information about the reason for the delay. This dataset can be used to predict the likelihood of a flight arriving on time.


Read Data

It would be interesting to visually capture details such as those explored in the exercises below.

1. Scatter Plot

Let us use a scatter plot to represent departure time changes with respect to airport distance

This plot should contain the following


Inferences

It can be inferred that there are more flights round the clock for shorter distances. However, for longer distances there are limited flights through the day.

2. Line Plot

Let us now use a line plot to extract average monthly arrival delay time and see how it changes over the year.

This plot should contain the following


Inferences

It is found that the average monthly delay time is highest in the month of June.

3. Bar Chart

Let us use a bar chart to extract the number of flights from a specific airline that go to a destination.

This plot should contain the following


Inferences

It is found that the maximum number of flights, around 68, are to destination state CA, and there is only 1 flight to destination state VT.

4. Histogram

Let us represent the distribution of arrival delay using a histogram

This plot should contain the following


Inferences

It is found that there are at most 5 flights with an arrival delay of 50-54 minutes and around 17 flights with an arrival delay of 20-25 minutes.

5. Bubble Plot

Let us use a bubble plot to represent the number of flights per reporting airline.

This plot should contain the following


Inferences

It is found that the reporting airline WN has the highest number of flights, which is around 86.

6. Pie Chart

Let us represent the proportion of distance group by month (month indicated by numbers)

This plot should contain the following


Inferences

It is found that February has the highest distance group proportion.

7. Sunburst Charts

Let us represent the hierarchical view in the order of month and destination state, with the number of flights as the value.

This plot should contain the following


Inferences

Here the month numbers in the innermost concentric circle are the roots, and for each month we can check the number of flights to the different destination states under it.

Summary

Congratulations on completing your lab.

In this lab, you have learnt how to use plotly.graph_objects and plotly.express for creating plots and charts.

Author(s)

Saishruthi Swaminathan

Lakshmi Holla

Other Contributor(s)

Lavanya T S

Changelog

Date | Version | Changed by | Change Description
07-02-2023 | 1.1 | Lakshmi Holla | Updated lab

© IBM Corporation 2023. All rights reserved.